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Abstract 

We present a least squares method for estimating parameters from measurements of event yields 
in the presence of background and crossfeed. We adopt a unified approach to incorporating the 
statistical and systematic uncertainties on the experimental measurements input to the fit. We 
demonstrate this method with a fit for absolute hadronic D meson branching fractions, measured 
in e + e~ — > -0(3770) — > DD transitions. 
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I. INTRODUCTION 



Least squares fitting is a well-known and powerful method for combining information 
from a set of related experimental measurements to estimate the underlying theoretical 
parameters (see, for instance, Reference [1]). We discuss a specific implementation of this 
method for use in high-energy physics experiments, where the free parameters, denoted by 
the vector m, are extracted from event yields for signal processes. Typically, these yields 
are subject to corrections for background, crossfeed, and efficiency Because the sizes of 
these corrections depend on the values of the free parameters, we make all yield adjustments 
directly in the fit. Often, the uncertainties on these corrections are ignored during the fit and 
are propagated to the free parameters afterwards. However, if these uncertainties modify 
the relative weights of the measurements, then the above two-step procedure would bias 
both the fitted central values and the estimated uncertainties. Therefore, we build the x 2 
variable from a full description of the uncertainties, statistical and systematic, as well as 
their correlations, on both the yields and their corrections. Thus, the input measurements 
- event yields, signal efficiencies, parameters quantifying the background processes, and 
background efficiencies — and their uncertainties are all treated in a uniform fashion. In 
the x 2 minimization, we account for the m dependence of the yield corrections. 

II. FORMALISM 

Below, we denote matrices by upper case bold letters and one-dimensional vectors by 
lower case bold letters. Let n represent a set of N event yield measurements, each for 
a different signal process. Each measurement may receive crossfeed contributions from 
other signal processes as well as backgrounds from non-signal sources. The background 
processes are described by b, a vector of B estimated production yields, which can be 
functions of experimentally measured quantities, such as branching fractions, cross sections, 
and luminosities. In principle, the free parameters m can also appear in b, although no 
additional degrees of freedom are introduced by b. The rates at which these background 
processes contaminate the signal yields are given by the N x B background efficiency matrix, 
F. Thus, the vector s = n — Fb represents the background-subtracted yields. 

We use an NxN signal efficiency matrix, E, to describe simultaneously detection efficien- 
cies (diagonal elements) and crossfeed probabilities (off-diagonal elements). The elements 
Eij are defined to be the probabilities that an event of signal process j is reconstructed and 
counted in yield i. The corrected yields, denoted by c, are obtained by acting on s with the 
inverse of E: 



Thus, c encapsulates all the experimental measurements. The variance matrix of c, denoted 
by V c , receives contributions, both statistical and systematic, from each element of n, b, 
E, and F. 

In the least squares fit, we define x 2 = (c — c) T V^ 1 (c — c), where c is the vector of 
predicted yields, which are also functions of m. Because both c and c (through b) depend 
on m, minimizing this x 2 amounts to a nonlinear version of the total least squares method [2] . 
We solve this problem by extending the conventional least squares fit to include contributions 
from both c and c in dx 2 /dm. Given a set of seed values, m , the optimized estimate, m, 




(1) 
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and its variance matrix, V m , are 

m = m + (DV C - 1 D T ) 1 DV C 1 [c(m ) - c(m )] (2) 

v = 1 ^x 2 = fpy-^^r 1 (3) 

where the M x N derivative matrix D is defined to be 



_ dc dc dc <9b rp / , \ t 

D = - — = — + — F T (E^ 1 ) . 4 

am am am am v ' 

In general, c and c are nonlinear functions of m, so the linearized solution m is approximate, 
and the above procedure is iterated until the x 2 converges. Between iterations, all the fit 
inputs that depend on m are reevaluated with the updated values of m. 

Nonlinearities also occur when V c contains multiplicative or Poisson uncertainties that 
depend on the measurement values. With the least squares method, these nonlinearities 
result in biased estimators unless these variable uncertainties are evaluated using the pre- 
dicted yields c instead of the measured c. Therefore, all three ingredients in the x 2 — c ) c, 
and V c — are functions of m. However, we do not include the derivatives oV c /om in D 
because doing so would generate biases in m. 

For a simple demonstration of the aforementioned biases, we consider two measured yields, 
Ci and C2, which are both estimators of a true yield c. We assume that the uncertainties on 
Ci and c 2 are uncorrected, multiplicative, and of the same fractional size, A. We construct 
an improved estimator, c, by minimizing x 2 — ( c i — c )V°ci + ( c 2 ~~ c ) 2 /cr 2 2 with respect to 
c. If, following the prescription given above, we neglect the daljdc terms in dx 2 /dc and 
assign (iteratively) the uncertainties a Cl = a C2 = Ac, then ci and c 2 are equally weighted, 
and c is an unbiased estimate of c: 

_ Ci + c 2 , s 

Cunbiased ^ \"/ 

2 2 / Ci -c 2 \ 2 

^unbiased A 2 Vci + cJ ' W 

On the other hand, including the daljdc terms in dx 2 '/dc results in an upward bias: 

C l + C 2 _ - (-, , ^Xunbiascd \ (~s 

^biasedl , ^unbiased I 1 t I I I ) 

Cl + C 2 \ 2 J 

2 ( C l ~ C 2) 2 i q\ 

Xbiasedl A 2 (cf + C|)' {) 

Finally, if we assign uncertainties based on the measured yields, not the predicted yields, 
such that a ci = Aci, a C2 = Ac 2 , and daljdc = 0, then the resulting estimate is biased low: 

ClC 2 (Ci + C 2 ) _ ^ , , 2 2 v /q\ 

Cbiascd2 — 2 I ~2 ~~ Cunbiased^ * Xbiasedl J W) 

C 1 -f- C 2 

Xbiascd2 Xbiasedl " 

(10) 

Thus, even though xLscdi and Xbiascd2 are smaller than ^unbiased, the corresponding estima- 
tors possess undesired properties. 
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III. INPUT VARIANCE MATRIX 



The uncertainties on the N elements of n and the B elements of b are characterized by 
the N x N matrix V n and the B x B matrix V b , respectively. Usually, the elements of 
E and F share many common correlated systematic uncertainties, so we construct a joint 
variance matrix from the sub matrices Ve, Vp, and Cef, where Vp (N 2 x iV 2 ) and Vp 
(NB x NB) are the variance matrices for the elements of E and F, respectively, and Cpp 
(iV 2 x NB) contains the correlations between E and F. Below, we label each element of E 
or F by two indices (Eij or Fij), and the two dimensions of E or F are mapped onto one 
dimension of Vp or Vp. 

We form V c by propagating the statistical and systematic uncertainties on n, b, E, and 
F to c via 



dc 
dn 



r 



V, 



* ■ dc v b ^ + ( ( ac/aEf (ac/OTf)(£- 



dn dh 



'db 



EF 




Where appropriate, we substitute c for c, as discussed in Section II. The first term of 
Equation 11 is simply E" 1 V n (E~ 1 ) T , and the second term is E~ 1 FV b F T (E" 1 ) T . For the 
third term, we evaluate the partial derivatives and find 



dc_ 
dE 
8c 
dF 



'dE- 



= s 



dE 



= — s 



(^(D (E-') T =-A(E-) : 



¥- = -b(e-M t , 



(12) 
(13) 



where A = c T (dE/dE) T and B = b T (<9F/<9F) T , with elements given in terms of the Kro- 
necker delta (<%): dEki/dEij = dFki/dFij = SikSji. The matrices A and B have rows labeled 
by two indices, which refer to the elements of E and F, respectively, and columns labeled 
by one index, which refers to the elements of c. In other words, the ij-th row of A is given 
by c T (dE/dE ij ) T , where (dE/dE i:j ) kl = dE k i/dE i:j . Therefore, the elements of A and B are 
Aij,k = 8ikCj and Bij k = b ik bj. For N = B = 2, these matrices are 



fci \ 

c 2 

ci 

V c 2 J 



and 



B 



fh \ 
h o 

6i 



(14) 



This treatment of error propagation in matrix inversion agrees with that derived in Refer- 
ence [3]. The above relations allow us to reexpress V c as 



V c = E- 1 V An (E- 1 ) 7 , 



(15) 



where V An = V n +FV b F T +A T V E A+B T VFB+A T CEpB+B T Ci; F A. As a result, we have 
X 2 = An T V A ^An, where An = n — Ec — Fb. Thus, the x 2 minimization can be formulated 

equivalently in terms of n instead of c: V m = (D'V A ^D /T ) and m = mo+VmD'V^An, 
where D' = DE T = (dc/dm)E T + (<9b/<9m)F T . 

Systematic uncertainties on the efficiencies are often multiplicative and belong to one 
of three categories: those that depend only on the reconstructed mode (row-wise), those 
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that depend only on the generated mode (column-wise), and those that are uncorrected 
among elements of E and F. For row-wise efficiency uncertainties, all the elements in any 
given row of E and F have the same fractional uncertainty, which we denote by Aj = 
UEij/Eij = cr F /Fy. The correlation coefficients between elements of different rows are 
\^j/(\Xj), where \j characterizes the uncertainties common to q and Cj. For instance, if 
Atrack is the fractional uncertainty associated with the charged particle tracking efficiency, 
then Aj = tiA trac k and A?- = titj\ 2 rack , where t« and i,- are the track multiplicities in modes i 
and j, respectively. Note that \u = Aj. Similarly, for column-wise uncertainties, we define 
the fractional uncertainties fij = UE^jE^ = ap^/F^ and correlation coefficients / (/li/ij) . 
We denote the uncorrelated fractional uncertainty on any element of E or F by v ik j\. Table I 
gives expressions for the elements of V E , V F , and Cefj as well as their contributions to V c 
for row-wise, column-wise, and uncorrelated uncertainties. 

TABLE I: Expressions for the elements of Ve, Vp, and Cef, as well as their contributions to V c . 
Repeated external indices are not summed over. 



Quantity 


Row-wise 


Column-wise 


Uncorrelated 


(VE)ifc,j7 




f4i E ikEji 


V< ik,il E ikEjl5ij5 k i 


(V F )jjfcj; 


^ij F ik F jl 


f4l F ik F jl 


V ik,jl F ik F jAj5kl 


(C E F)ik,jl 




VkiEikFji 





(A T V E A) ij 




Li k iEi k c k Ejici 


X 2 ~2 
d ij a E jk c k 


(B T V F B) tJ 


XfjFikbkFjibi n 2 kl Fi k bkFjibi 




(A T C EF B) ij 


X%SiFj k b k 


fJ- k iE ik c k Fjibi 






IV. EXAMPLE: HADRONIC D MESON BRANCHING FRACTIONS 

The least squares method described in the previous sections has been employed by the 
CLEO-c collaboration [4] to measure absolute branching fractions for hadronic D meson 
decays. Using DD pairs produced through the ^(3770) resonance, the branching fraction 
for mode i, denoted by £>j, is measured by comparing the number of events where a single 
D — > % decay is reconstructed (called single tag, denoted by Xi) with the number of events 
where both D and D are reconstructed via D — > i and D — > j (called double tag, denoted 
by yij). These yield measurements form the vector n. The free parameters m are the Bi and 
the numbers of D°D° and D + D~ pairs produced, denoted by A/" 00 and Af + ~, respectively, 
and denoted generically by N '. Yields for charge conjugate modes are measured separately, 
so the predicted corrected yields c are MB{ for single tags and AfBiBj for double tags. Thus, 
Bi and N can be extracted from various products and ratios of Xi, Xj, and y^-: Bi ~ Uij/xj, 
Af ~ XiXj/i/ij, up to corrections for efficiency, crossfeed, and background. 

The matrix V n describes the statistical uncertainties and correlations among the Xi and 
yij. The are uncorrelated, but because any given event can contain both single tag and 
double tag candidates, the Xi are correlated among themselves as well as with the y^. If 
the selection criteria for single and double tags are the same, then the events (signal and 
background) used to estimate y^ are a proper subset of those for Xi and Xj. Thus, any 
single tag yield is a sum of exclusive single tags (a;f cl ) and double tags: x^jy = + yij- 
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CT Xj + 



Propagating the uncertainties on the independent variables, x\ , Xj, and y^, gives the 
following elements for V n : 

(AxiAxj) = 
(AyijAyki) = 
(AxiAy jk ) = 

where Ax { = x, L - (xi), Ay tj = y {j - (y t 



+ 5 ik )a 2 



Vjk- 



and a 2 , . = a 2 x + a 2 



(16) 
(17) 
(18) 

n 3 n — » X{iJ} - , Thus, for any two 

single tag yields and the corresponding double tag yield, the three off-diagonal elements of 
V n are all given by the uncertainty on the number of overlapping events. In addition to 
these statistical uncertainties, V n can also receive contributions from additive systematic 
uncertainties. 

Some of the sources of background we consider are non-signal D decays, e + e~ 



events, and e 



-> qq 

t + t~ events. If there are two non-signal D backgrounds with branching 



fractions C\ and C 2 , then the vector b is given by 



A/VJi 

cx, 



qq 



(19) 



V cx T+T - ) 

are the cross sections for qq and t + t~ 



where X q q and X T + T - are the cross sections for qq and t + t~ production, respectively, and 
C is the integrated luminosity of the data sample. Because of the non-signal D decays, the 
free parameter M appears in b but does not contribute any additional terms to the variance 
matrix Vb, which takes the following block diagonal form: 



V h = 



V 



o 








N 2 a 2 





C-2 








a 



x 



X, 



qq 



Xl-a 



2 

qq~C 



XggX T 



C 2 a 2 








a. 



\ 



(20) 



Also, the matrix (9b/<9m is nontrivial and is incorporated into the x 2 minimization. 

In the joint variance matrix for E and F, uncertainties of all three types discussed in 
Section III are present. Row-wise effects arise from systematic uncertainties on simulated 
reconstruction efficiencies for charged tracks, 7r° — > 77 decays, Kg — > it + it~ decays, and par- 
ticle identification (PID) for charged pions and kaons. Column-wise uncertainties reflect the 
poorly known resonant substructure in multi-body final states. Uncorrelated contributions 
come from statistical uncertainties due to the finite Monte Carlo (MC) simulated samples 



used to determine E and F. Thus, for example, if mode % is D° 
D + — > Kg-n + , then the row- wise uncertainties are given by 

A 



K 7r + 7r° and mode j is 



2 

i 
2 



(2A tr ack) 2 + A^o 



+ ^±PID + ^±PID 



A 



— (3A trac k) 2 + A^ipjD 
= 6A t 2 



(21) 
(22) 

track ' ^TriPID' 

(23) 

Because these row-wise and column-wise uncertainties are completely correlated among the 
yields to which they pertain, they degrade the precision of Bi but not Af. Furthermore, 
they have no effect on the central values of m because the relative weight of each yield is 
unaltered by these uncertainties. However, they can introduce large systematic correlations 
among the fit parameters, even between statistically independent branching fractions of 
different charge. 
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No. of Standard Dev Confidence Level 

FIG. 1: Toy MC fit pull distributions for A/" 00 (a), B(D° -> K~Tr + ) (b), B(D° -► i^-vr+vr ) (c), 
B(D° -» ^-7r+vr-vr+) (d), (e), £(D+ -» ^-vr+7r+) (f), and £(L>+ -► K§tt + ) (g), overlaid 

with Gaussian curves with zero mean and unit width. The fit confidence level distribution (h) is 
overlaid with a line with zero slope. 

A. Toy Monte Carlo Study 

We test the method presented above using a toy MC simulation with Gaussian smearing 
of the fit inputs. We generate data for five decay modes, D° — > K~n + , D° — > K~tt + 7t°, 
D° — > K~7t + 7t~7t + , D + — > K~7t + tt + , and D + — > fT^7r + (charge conjugate particles are 
implied), for which there are ten single tag and thirteen double tag yields. The fit determines 
seven free parameters: A/" 00 , A/" +_ , and five charge-averaged branching fractions. The input 
branching fractions are taken to be the world-average values given in Reference [I], and we 
use A/" 00 = 2.0 x 10 5 and N + ~ = 1.5 x 10 5 . The efficiencies are mode-dependent: 30%- 
70% for single tags and 10%-50% for double tags, with fractional statistical uncertainties 
of 0.5%-1.0%. The yield uncertainties are specified to be close to the Poisson limit, and 
backgrounds correspond roughly to those expected in 60 pb _1 of e + e~ collisions at the 
■0(3770). Also, we apply correlated systematic efficiency uncertainties of 1% for tracking, 
2% for 7T° reconstruction, 2% for Kg reconstruction, and 1% for charged pion and kaon PID. 

The fit reproduces the input parameters well. Figure 1 shows the pull distributions for 
the seven fit parameters and the fit confidence level for 10000 toy MC trials. All the pull 
distributions are unbiased and have widths consistent with unity. Also, the confidence level 
is flat. Table II gives the correlation coefficients among the fit parameters. Branching 
fractions tend to be positively correlated with each other and negatively correlated with 
A/" 00 and A/" + ~. In particular, the D° branching fractions are correlated with those for D + . 
In the absence of correlated efficiency uncertainties, the D° and D + free parameters would 
essentially be independent. 

Slight asymmetries can be observed in the pull distributions, especially in those for A/" 00 
and A/' H . These asymmetries are caused by the nonlinear nature of the multiplicative 
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TABLE II: Correlation coefficients, including systematic uncertainties, for the free parameters 
determined by the fit to toy MC samples. 







K--K+ 




K 1T + 1T 7T + 




K 7T + 7T + 




J\f 00 




1 


-0.63 


-0.52 


-0.38 


-0.01 


-0.01 


-0.01 


K-TT' 






1 


0.79 


0.87 


-0.01 


0.40 


0.29 




-vr 






1 


0.77 


-0.01 


0.37 


0.27 




~7T~7T + 








1 


-0.01 


0.53 


0.39 


AA+- 












1 


-0.82 


-0.77 




-7T+ 












1 


0.87 


















1 



efficiency uncertainties and of the functions c(m). Because the fit parameters are effectively 
estimated from ratios of the input yields, Gaussian fluctuations in the denominators produce 
non-Gaussian fluctuations in the ratios, which are most visible in A/" 00 and A/" + ~, where the 
uncertainties in the denominators are dominant. Similarly, multiplicative uncertainties, 
which affect only the branching fractions, scale with the fitted values and, therefore, give 
rise to asymmetric B pulls. In both cases, larger fractional uncertainties would heighten the 
asymmetries. 

If we form the matrix A in Equation 14 using the measured yields c rather than the 
predicted yields c, then the variance matrix V c need not be reevaluated after each fit itera- 
tion. However, in this case, the pull distributions become significantly biased, as shown in 
Figure 2. Thus, obtaining unbiased fit results and the correct uncertainties requires proper 
handling of the efficiency variance matrices Ve and Vf- 

V. SUMMARY 

We have developed a least squares fit that simultaneously incorporates statistical and 
systematic uncertainties, as well as their correlations, on all the input experimental mea- 
surements. Biases from nonlinearities are reduced by introducing fit parameter dependence 
in the input variance matrix. This fitting method is used to measure absolute branching 
fractions of hadronic D meson decays, and toy Monte Carlo studies validate the performance 
of the fitter. By including all known sources of measurement uncertainty in the x 2 , we obtain 
unbiased fit parameters with correct estimated uncertainties. 
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FIG. 2: Toy MC fit pull distributions, with V c calculated using c instead of c, for M 00 (a), B(D° -> 
K-TT+) (b), B(D° -» K-tt+tt ) (c), B(D° -» ^-7r+7r-vr+) (d), (e), -» if-7r+7r+) (f), 

and B(D + — > K^7r + ) (g), overlaid with Gaussian curves with zero mean and unit width. The fit 
confidence level distribution (h) is overlaid with a line with zero slope. 
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